Forgetting Counts: Constant Memory Inference for a Dependent Hierarchical Pitman-Yor Process
Authors
Abstract
We propose a novel dependent hierarchical Pitman-Yor process model for discrete data and develop an incremental Monte Carlo inference procedure for it. We show that inference in this model can be performed in constant space and linear time. The model is demonstrated on a discrete sequence prediction task, where it achieves state-of-the-art prediction performance while using significantly less memory.
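The Pitman-Yor process underlying the model is often described via its Chinese restaurant process representation, in which customer n + 1 joins an occupied table with probability proportional to its count minus a discount, or a new table with probability proportional to the concentration plus the discount times the number of tables. The sketch below is purely illustrative of that seating rule, not the authors' dependent hierarchical model or its constant-memory sampler; the function name and parameters are hypothetical.

```python
import random

def pitman_yor_crp(n_customers, discount=0.5, concentration=1.0, seed=0):
    """Illustrative sketch: sample table occupancy counts from the
    Pitman-Yor Chinese restaurant process (not the paper's model)."""
    rng = random.Random(seed)
    table_counts = []  # table_counts[k] = customers seated at table k
    for n in range(n_customers):
        # Existing table k is chosen w.p. (c_k - d) / (n + theta);
        # a new table w.p. (theta + d * K) / (n + theta), K = #tables.
        weights = [c - discount for c in table_counts]
        weights.append(concentration + discount * len(table_counts))
        r = rng.random() * sum(weights)
        for k, w in enumerate(weights):
            r -= w
            if r <= 0:
                break
        if k == len(table_counts):
            table_counts.append(1)  # open a new table
        else:
            table_counts[k] += 1
    return table_counts
```

With a positive discount the resulting table sizes exhibit the power-law behavior that makes the Pitman-Yor process attractive for language data.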
Similar Resources
Parallel Markov Chain Monte Carlo for Pitman-Yor Mixture Models
The Pitman-Yor process provides an elegant way to cluster data that exhibit power-law behavior, where the number of clusters is unknown or unbounded. Unfortunately, inference in Pitman-Yor process-based models is typically slow and does not scale well with dataset size. In this paper we present new auxiliary-variable representations for the Pitman-Yor process and a special case of the hierarchic...
A Hierarchical Nonparametric Bayesian Approach to Statistical Language Model Domain Adaptation
In this paper we present a doubly hierarchical Pitman-Yor process language model. Its bottom layer of hierarchy consists of multiple hierarchical Pitman-Yor process language models, one each for some number of domains. The novel top layer of hierarchy consists of a mechanism to couple together multiple language models such that they share statistical strength. Intuitively this sharing results i...
A Parallel Training Algorithm for Hierarchical Pitman-Yor Process Language Models
The Hierarchical Pitman-Yor Process Language Model (HPYLM) is a Bayesian language model based on a nonparametric prior, the Pitman-Yor process. It has been demonstrated, both theoretically and practically, that the HPYLM can provide better smoothing for language modeling than state-of-the-art approaches such as interpolated Kneser-Ney and modified Kneser-Ney smoothing. However, estimat...
A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction
In this work we address the problem of unsupervised part-of-speech induction by bringing together several strands of research into a single model. We develop a novel hidden Markov model incorporating sophisticated smoothing using a hierarchical Pitman-Yor process prior, providing an elegant and principled means of incorporating lexical characteristics. Central to our approach is a new type-ba...
Bayesian Unsupervised Word Segmentation with Nested Pitman-Yor Language Modeling
In this paper, we propose a new Bayesian model for fully unsupervised word segmentation and an efficient blocked Gibbs sampler combined with dynamic programming for inference. Our model is a nested hierarchical Pitman-Yor language model, in which a Pitman-Yor spelling model is embedded in the word model. We confirmed that it significantly outperforms previously reported results in both phonetic transc...
Journal title:
Volume / Issue:
Pages: -
Publication date: 2010